Python egg package info | 2009-06-03 | 9.2 KB | 245 lines |
- Metadata-Version: 1.0
- Name: PyICU
- Version: 0.8.1
- Summary: Python extension wrapping the ICU C++ API
- Home-page: http://pyicu.osafoundation.org/
- Author: Open Source Applications Foundation
- Author-email: UNKNOWN
- License: UNKNOWN
- Description:
-
- README file for PyICU
- ---------------------
-
- Contents
- --------
-
- - Welcome
- - Building PyICU
- - Running PyICU
- - What's available
- - API Documentation
-
-
- Welcome
- -------
-
- Welcome to PyICU, a Python extension wrapping IBM's International
- Components for Unicode C++ library (ICU).
-
- PyICU is a project maintained by the Open Source Applications Foundation.
-
- IBM's ICU homepage is: http://www-306.ibm.com/software/globalization/icu/
-
-
- Building PyICU
- --------------
-
- Before building PyICU, the ICU 3.6 or 3.8 libraries must be built and
- installed. Refer to the ICU build instructions for your system for more
- information.
-
- As of version 0.5 PyICU no longer uses SWIG.
-
- As of version 0.8 PyICU is built with distutils or setuptools:
- - verify that the INCLUDES, LFLAGS, CFLAGS and LIBRARIES dictionaries in
- setup.py contain correct values for your platform
- - python setup.py build
- - sudo python setup.py install
-
-
- Running PyICU
- -------------
-
- . Mac OS X
- Make sure that DYLD_LIBRARY_PATH lists every directory containing the
- ICU libs.
-
- . Linux
- Make sure that LD_LIBRARY_PATH lists every directory containing the
- ICU libs, or that you added the corresponding -rpath argument to
- LFLAGS.
-
- . Windows
- Make sure that PATH lists every directory containing the ICU DLLs.
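-
- The platform rules above can be checked with a small script. This is a
- minimal sketch in plain Python 3 using only the standard library, not
- part of PyICU; the helper names library_path_variable and
- listed_directories are hypothetical.

```python
import os
import sys


def library_path_variable(platform=None):
    """Return the environment variable the README names for a platform."""
    platform = sys.platform if platform is None else platform
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    if platform == "win32":
        return "PATH"
    return None  # platform not covered by the README


def listed_directories(variable):
    """Split the variable's current value into its directory entries."""
    value = os.environ.get(variable, "")
    return [entry for entry in value.split(os.pathsep) if entry]


variable = library_path_variable()
if variable is not None:
    # Shows which directories the dynamic loader will search for the ICU libs.
    print(variable, listed_directories(variable))
```

- If the directory holding the ICU libs is missing from the printed list,
- importing PyICU is likely to fail at load time.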
-
-
- What's available
- ----------------
-
- PyICU is under active development. Currently, the string, locale, format,
- calendar, timezone, charset and various iterator classes are available.
- See the CHANGES file for an up-to-date log of changes and additions.
-
-
- API Documentation
- -----------------
-
- At the moment, there is no API documentation for PyICU. The API for ICU is
- documented at http://icu.sourceforge.net/apiref/icu4c/ and the following
- patterns can be used to translate from the C++ APIs to the corresponding
- Python APIs.
-
- - strings
-
- The ICU string type, UnicodeString, wraps a mutable array of UChar
- values, which are 16-bit UTF-16 code units. The Python unicode type is
- an immutable string of 16-bit or 32-bit wide Unicode characters,
- depending on how Python was built.
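-
- The difference in character width shows up for characters outside the
- Basic Multilingual Plane, which need two 16-bit units. The following is a
- minimal sketch in plain Python 3 (not PyICU, and not the Python 2 unicode
- type described here); the helper name utf16_length is hypothetical.

```python
def utf16_length(text):
    """Return the number of 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is two bytes. Encoding as "utf-16-le" omits the
    # byte order mark, so it is not counted as an extra unit.
    return len(text.encode("utf-16-le")) // 2


bmp_text = "Brazil"          # every character fits in one 16-bit unit
astral_text = "\U0001F600"   # a single code point outside the BMP

print(len(bmp_text), utf16_length(bmp_text))
print(len(astral_text), utf16_length(astral_text))
```

- A length measured in 16-bit units can therefore be larger than the
- number of characters a reader would count.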
-
- Because of these differences, UnicodeString and Python's unicode type
- are not merged into the same type when crossing the C++ boundary.
- ICU APIs taking UnicodeString arguments have been overloaded to also
- accept Python str or unicode type arguments. In the case of str objects,
- utf-8 encoding is assumed when converting them to UnicodeString
- objects.
-
- To convert a Python str encoded in an encoding other than utf-8 to an ICU
- UnicodeString, use the UnicodeString(str, encodingName) constructor.
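-
- Why the encoding name matters can be seen without PyICU. The following
- is a minimal sketch in plain Python 3, where byte strings play the role
- of Python 2 str objects; the helper name decodes_as_utf8 is hypothetical.

```python
def decodes_as_utf8(raw):
    """Return True if the byte string is valid utf-8, False otherwise."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


city = "S\u00e3o Paulo"
raw_utf8 = city.encode("utf-8")       # safe under the utf-8 assumption
raw_latin1 = city.encode("latin-1")   # needs an explicit encoding name

print(decodes_as_utf8(raw_utf8))
print(decodes_as_utf8(raw_latin1))

# Naming the real encoding recovers the original text.
print(raw_latin1.decode("latin-1") == city)
```

- Byte strings that are not utf-8 cannot be converted under the default
- assumption, which is the case the two-argument constructor addresses.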
-
- ICU's C++ APIs accept and return UnicodeString arguments in several
- ways: by value, by pointer or by reference.
- When an ICU C++ API is documented to accept a UnicodeString & parameter,
- it is safe to assume that several corresponding PyICU Python APIs make
- it accessible in simpler ways.
- For example, the 'UnicodeString &Locale::getDisplayName(UnicodeString &)'
- API, documented here:
- http://icu.sourceforge.net/apiref/icu4c/classLocale.html#a19
- can be invoked from Python in several ways:
-
- 1. The ICU way
-
- >>> from PyICU import UnicodeString, Locale
- >>> locale = Locale('pt_BR')
- >>> string = UnicodeString()
- >>> name = locale.getDisplayName(string)
- >>> name
- <UnicodeString: Portuguese (Brazil)>
- >>> name is string
- True <-- string arg was returned, modified in place
-
- 2. The Python way
-
- >>> from PyICU import Locale
- >>> locale = Locale('pt_BR')
- >>> name = locale.getDisplayName()
- >>> name
- <UnicodeString: Portuguese (Brazil)>
-
- A UnicodeString object was allocated for Python and returned.
-
- A UnicodeString can be coerced to a Python unicode string with Python's
- unicode() constructor. The usual len(), str(), comparison, [] and [:]
- operators are all available, with the additional twists that slicing is
- not read-only and that += is also available since a UnicodeString is
- mutable. For example:
-
- >>> name = locale.getDisplayName()
- >>> name
- <UnicodeString: Portuguese (Brazil)>
- >>> unicode(name)
- u'Portuguese (Brazil)'
- >>> len(name)
- 19
- >>> str(name) <-- works when chars fit with default encoding
- 'Portuguese (Brazil)'
- >>> name[3]
- u't'
- >>> name[12:18]
- <UnicodeString: Brazil>
- >>> name[12:18] = 'the country of Brasil'
- >>> name
- <UnicodeString: Portuguese (the country of Brasil)>
- >>> name += ' oh joy'
- >>> name
- <UnicodeString: Portuguese (the country of Brasil) oh joy>
-
- - error reporting
-
- The C++ ICU library does not use C++ exceptions to report errors. ICU
- C++ APIs return errors via a UErrorCode reference argument. All such
- APIs are wrapped by Python APIs that omit this argument and raise an
- ICUError Python exception instead. The same is true for ICU APIs taking
- both a UParseError and a UErrorCode: both arguments are omitted.
-
- For example, the 'UnicodeString &DateFormat::format(const Formattable &,
- UnicodeString &, UErrorCode &)' API, documented here
- http://icu.sourceforge.net/apiref/icu4c/classDateFormat.html#a6
- is invoked from Python with:
-
- >>> from PyICU import DateFormat, Formattable
- >>> df = DateFormat.createInstance()
- >>> df
- <SimpleDateFormat: M/d/yy h:mm a>
- >>> f = Formattable(940284258.0, Formattable.kIsDate)
- >>> df.format(f)
- <UnicodeString: 10/18/99 3:04 PM>
-
- Of course, the simpler 'UnicodeString &DateFormat::format(UDate,
- UnicodeString &)' documented here:
- http://icu.sourceforge.net/apiref/icu4c/classDateFormat.html#a5
- can be used too:
-
- >>> from PyICU import DateFormat
- >>> df = DateFormat.createInstance()
- >>> df
- <SimpleDateFormat: M/d/yy h:mm a>
- >>> df.format(940284258.0)
- <UnicodeString: 10/18/99 3:04 PM>
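-
- The wrapping pattern described in this section can be sketched in plain
- Python 3 without PyICU. This is an illustration only, not PyICU's
- implementation: SketchError stands in for ICUError, and
- c_style_parse_digit and parse_digit are hypothetical functions. The two
- constants mirror the ICU status codes of the same names.

```python
U_ZERO_ERROR = 0              # ICU status code meaning "no error"
U_ILLEGAL_ARGUMENT_ERROR = 1  # ICU status code for a bad argument


class SketchError(Exception):
    """Hypothetical stand-in for the ICUError exception."""

    def __init__(self, code):
        Exception.__init__(self, "status code %d" % code)
        self.code = code


def c_style_parse_digit(text, status):
    """Mimic an ICU-style call: failures are reported through 'status'.

    'status' is a one-element list standing in for a UErrorCode & argument.
    """
    if len(text) != 1 or not text.isdigit():
        status[0] = U_ILLEGAL_ARGUMENT_ERROR
        return 0
    return int(text)


def parse_digit(text):
    """Python-style wrapper: no status argument, raises on failure."""
    status = [U_ZERO_ERROR]
    result = c_style_parse_digit(text, status)
    if status[0] > U_ZERO_ERROR:  # positive codes are failures in ICU
        raise SketchError(status[0])
    return result


print(parse_digit("7"))
try:
    parse_digit("x")
except SketchError as error:
    print("failed with", error.code)
```

- Callers of the wrapper handle failures with try/except instead of
- inspecting a status value after every call.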
-
- - dates
-
- ICU represents dates with UDate, a double precision floating point type
- holding the number of milliseconds elapsed since 1970-jan-01 UTC.
-
- In Python, the value returned by the time module's time() function is
- the number of seconds since 1970-jan-01 UTC. Because of this difference,
- Python floating point values are multiplied by 1000 when passed to APIs
- taking UDate, and UDate values are divided by 1000 when returned to
- Python.
-
- Python's datetime objects, with or without timezone information, can
- also be used with APIs taking UDate arguments. The datetime objects get
- converted to UDate when crossing into the C++ layer.
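-
- The unit conversion described in this section amounts to the arithmetic
- below. This is a minimal sketch in plain Python 3, not PyICU's own code;
- the three function names are hypothetical, and the use of
- datetime.timestamp() (which treats naive datetime objects as local time)
- is an assumption about how such a conversion could be written.

```python
from datetime import datetime, timezone

MILLIS_PER_SECOND = 1000.0


def seconds_to_udate(seconds):
    """Convert Python epoch seconds to UDate-style epoch milliseconds."""
    return seconds * MILLIS_PER_SECOND


def udate_to_seconds(udate):
    """Convert UDate-style epoch milliseconds back to Python epoch seconds."""
    return udate / MILLIS_PER_SECOND


def datetime_to_udate(moment):
    """Convert a datetime to epoch milliseconds via its POSIX timestamp."""
    return seconds_to_udate(moment.timestamp())


# The timestamp used in the DateFormat examples of this README.
seconds = 940284258.0
print(seconds_to_udate(seconds))

moment = datetime(1999, 10, 18, 22, 4, 18, tzinfo=timezone.utc)
print(datetime_to_udate(moment) == seconds_to_udate(seconds))
```

- Python code therefore keeps working in seconds, as with time.time(),
- and the factor of 1000 is applied only at the boundary.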
-
- - arrays
-
- Many ICU APIs take array arguments. From Python, pass a list whose
- elements are of the array's element type.
-
- - StringEnumeration
-
- An ICU StringEnumeration has three 'next' methods: next(), which returns
- 'str' objects, unext(), which returns 'unicode' objects, and snext(),
- which returns 'UnicodeString' objects.
- Any of these methods can be turned into an iterator with the Python
- built-in 'iter' function.
-
- For example, let e be a StringEnumeration instance:
-
- [s for s in e] is a list of 'str' objects
- [s for s in iter(e.unext, None)] is a list of 'unicode' objects
- [s for s in iter(e.snext, None)] is a list of 'UnicodeString' objects
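-
- The two-argument form of 'iter' used above calls a function repeatedly
- until it returns the sentinel value. The following is a minimal sketch
- in plain Python 3 without PyICU; SketchEnumeration is a hypothetical
- stand-in whose unext() returns None once it is exhausted.

```python
class SketchEnumeration:
    """Hypothetical stand-in for a StringEnumeration."""

    def __init__(self, items):
        self._items = list(items)
        self._position = 0

    def unext(self):
        """Return the next item, or None when no items are left."""
        if self._position >= len(self._items):
            return None
        item = self._items[self._position]
        self._position += 1
        return item


e = SketchEnumeration(["Pacific/Fiji", "US/Mountain"])

# iter(callable, sentinel) keeps calling e.unext until it returns None.
zone_ids = [s for s in iter(e.unext, None)]
print(zone_ids)
```

- The enumeration is consumed by the loop, so a second pass over the same
- object yields nothing.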
-
- - timezones
-
- The ICU TimeZone type may be wrapped with an ICUtzinfo type for usage
- with Python's datetime type. For example:
-
- tz = ICUtzinfo(TimeZone.createTimeZone('US/Mountain'))
- datetime.now(tz)
-
- or, even simpler:
-
- tz = ICUtzinfo.getInstance('Pacific/Fiji')
- datetime.now(tz)
-
- To get the default time zone use:
-
- defaultTZ = ICUtzinfo.getDefault()
-
- To get the time zone's id, use the 'tzid' attribute or coerce the time
- zone to a string:
-
- ICUtzinfo.getInstance('Pacific/Fiji').tzid -> 'Pacific/Fiji'
- str(ICUtzinfo.getInstance('Pacific/Fiji')) -> 'Pacific/Fiji'
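-
- How a wrapper type can adapt a time zone object for Python's datetime
- can be sketched without PyICU. This is an illustration in plain Python 3,
- not ICUtzinfo's implementation: SketchZone and SketchTzinfo are
- hypothetical classes, and the sketch uses a fixed offset and ignores
- daylight saving time, which a real wrapper has to handle. The method
- names getID and getRawOffset follow ICU's TimeZone API, where the raw
- offset is expressed in milliseconds.

```python
from datetime import datetime, timedelta, tzinfo


class SketchZone:
    """Hypothetical stand-in for an ICU TimeZone with a fixed offset."""

    def __init__(self, zone_id, raw_offset_millis):
        self._zone_id = zone_id
        self._raw_offset_millis = raw_offset_millis

    def getID(self):
        return self._zone_id

    def getRawOffset(self):
        return self._raw_offset_millis


class SketchTzinfo(tzinfo):
    """Adapts a zone object to the datetime.tzinfo interface."""

    def __init__(self, zone):
        self._zone = zone

    @property
    def tzid(self):
        return self._zone.getID()

    def utcoffset(self, dt):
        return timedelta(milliseconds=self._zone.getRawOffset())

    def dst(self, dt):
        return timedelta(0)  # simplification: no daylight saving time

    def tzname(self, dt):
        return self.tzid

    def __str__(self):
        return self.tzid


tz = SketchTzinfo(SketchZone("Pacific/Fiji", 12 * 3600 * 1000))
print(datetime.now(tz).utcoffset(), tz.tzid, str(tz))
```

- Once wrapped this way, the zone can be passed anywhere datetime accepts
- a tzinfo argument.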
-
- Platform: UNKNOWN
-